Skip to content

Conversation

puretension
Copy link

What does this PR do?

Adds DD_DASHBOARD_FORCE_SYNC_PERIOD environment variable to configure DatadogDashboard controller sync frequency, following the same pattern as DD_MONITOR_FORCE_SYNC_PERIOD.

Motivation

Fixes #2179 - DatadogDashboard controller has a hardcoded 60-minute force sync period, unlike DatadogMonitor which supports DD_MONITOR_FORCE_SYNC_PERIOD. This means dashboard drift from manual UI changes can persist for up to 60 minutes.

Additional Notes

  • Implementation follows the exact same pattern as DatadogMonitor controller
  • Defaults to 60 minutes if environment variable is not set
  • Includes error handling for invalid values
  • All existing tests pass

Minimum Agent Versions

No minimum version requirements - this is an operator-only change.

  • Agent: N/A
  • Cluster Agent: N/A

Describe your test plan

  • Verified existing DatadogDashboard tests still pass
  • Tested environment variable parsing logic with valid/invalid values
  • Confirmed lint and build checks pass

Checklist

  • PR has at least one valid label: enhancement
  • PR has a milestone or the qa/skip-qa label

@puretension puretension requested a review from a team as a code owner September 18, 2025 15:13
@puretension
Copy link
Author

Hi! Could someone please add the 'enhancement' label and either a milestone or the 'qa/skip-qa' label to this PR?
The CI is failing due to missing labels/milestone.

@levan-m levan-m added the enhancement New feature or request label Sep 18, 2025
@levan-m levan-m added this to the v1.20.0 milestone Sep 18, 2025
Copy link

@orkhanM orkhanM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should any documentation be updated?

logger.Info("DatadogDashboard manifest has changed")
shouldUpdate = true
} else if instance.Status.LastForceSyncTime == nil || ((defaultForceSyncPeriod - now.Sub(instance.Status.LastForceSyncTime.Time)) <= 0) {
} else if instance.Status.LastForceSyncTime == nil || ((forceSyncPeriod - now.Sub(instance.Status.LastForceSyncTime.Time)) <= 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@puretension Thanks for putting this together! It looks good I think but it seems like this doesn't follow the same pattern as the monitors do exactly. I haven't dived too deeply into the patterns used in other crd controllers here but the Monitor controller seems to check the Monitor specific MonitorLastForceSyncTime rather than the generic LastForceSyncTime:

} else if instance.Status.MonitorLastForceSyncTime == nil || (forceSyncPeriod-now.Sub(instance.Status.MonitorLastForceSyncTime.Time)) <= 0 {
// Periodically force a sync with the API monitor to ensure parity
// Get monitor to make sure it exists before trying any updates. If it doesn't, set shouldCreate
m, err = r.get(instance, newStatus)
if err != nil {
logger.Error(err, "error getting monitor", "Monitor ID", instance.Status.ID)
if strings.Contains(err.Error(), ctrutils.NotFoundString) {
shouldCreate = true
}
} else {
shouldUpdate = true
}
} else if instance.Status.MonitorStateLastUpdateTime == nil || (defaultRequeuePeriod-now.Sub(instance.Status.MonitorStateLastUpdateTime.Time)) <= 0 {
// If other conditions aren't met, and we have passed the defaultRequeuePeriod, then update monitor state
// Get monitor to make sure it exists before trying any updates. If it doesn't, set shouldCreate
m, err = r.get(instance, newStatus)
if err != nil {
logger.Error(err, "error getting monitor", "Monitor ID", instance.Status.ID)
if strings.Contains(err.Error(), ctrutils.NotFoundString) {
shouldCreate = true
}
}
updateMonitorState(m, now, newStatus)
}
}

The MonitorLastForceSyncTime seems to be set whenever a monitor is synced. I'm not sure if having a Monitor specific variable there was intentional or not or if the maintainers care about consistency in this case but thought it'd be worthwhile to bring to your attention.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orkhanM Great catch! You're absolutely right about the consistency issue.

Looking at the Monitor controller pattern you referenced,
I can see it uses MonitorLastForceSyncTime for the force sync check.
I'll update the Dashboard implementation to use DashboardLastForceSyncTime
to match this pattern for consistency across controllers.

Thanks for pointing this out!

@puretension
Copy link
Author

Should any documentation be updated?

I don't think any documentation updates are needed for this change
since it's an internal API field rename for consistency with the Monitor controller pattern.
But I'm not entirely sure😅 would appreciate the maintainers' input on this!

@dd-octo-sts
Copy link

dd-octo-sts bot commented Oct 9, 2025

This pull request has been automatically marked as stale because it has not had activity in the past 15 days.

It will be closed in 30 days if no further activity occurs. If this pull request is still relevant, adding a comment or pushing new commits will keep it open. Also, you can always reopen the pull request if you missed the window.

Thank you for your contributions!

@dd-octo-sts dd-octo-sts bot added the stale label Oct 9, 2025
Copy link
Member

@tbavelier tbavelier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @puretension ,

Thank you for the contribution! While I understand the CRD was changed to be consistent with the monitor one, such changes could break customers that do not update the CRD while updating the operator by trying to access a field that does not exist. Moreover, changing a CRD (_types.go) should cause more files to be changed (make generate should show these changes).

Could you please remove the last commit ? I think docs/datadog_dashboard.md could also benefit from this change being documented.

Thanks!

@tbavelier tbavelier modified the milestones: v1.20.0, v1.21.0 Oct 17, 2025
@puretension puretension force-pushed the add-dashboard-force-sync-period-env branch from d91dec2 to d7ce92c Compare October 17, 2025 12:50
@puretension puretension requested a review from a team as a code owner October 17, 2025 12:50
@puretension
Copy link
Author

Hi @tbavelier,

Thank you for the detailed feedback and for explaining the compatibility concerns with CRD changes. I've removed the last commit that modified the CRD to avoid potential issues for customers who might update the operator without updating the CRD. I also added documentation for the DD_DASHBOARD_FORCE_SYNC_PERIOD environment variable in docs/datadog_dashboard.md as you suggested.

I appreciate you taking the time to explain why the CRD change could be problematic and the importance of running make generate when modifying types. Please let me know if there's anything else I missed or if I should make any other adjustments to the implementation.

Thanks again for your guidance!

Copy link
Member

@tbavelier tbavelier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is looking fine, could you please update the documentation based on my comment ? Thanks in advance

…_FORCE_SYNC_PERIOD in step 3

Signed-off-by: puretension <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DatadogDashboard - add configurable DD_DASHBOARD_FORCE_SYNC_PERIOD environment variable

4 participants